shacl shape
Overcoming the Generalization Limits of SLM Finetuning for Shape-Based Extraction of Datatype and Object Properties
Ringwald, Célian, Gandon, Fabien, Faron, Catherine, Michel, Franck, Akl, Hanna Abi
Small language models (SLMs) have shown promises for relation extraction (RE) when extracting RDF triples guided by SHACL shapes focused on common datatype properties. This paper investigates how SLMs handle both datatype and object properties for a complete RDF graph extraction. We show that the key bottleneck is related to long-tail distribution of rare properties. To solve this issue, we evaluate several strategies: stratified sampling, weighted loss, dataset scaling, and template-based synthetic data augmentation. We show that the best strategy to perform equally well over unbalanced target properties is to build a training set where the number of occurrences of each property exceeds a given threshold. To enable reproducibility, we publicly released our datasets, experimental results and code. Our findings offer practical guidance for training shape-aware SLMs and highlight promising directions for future work in semantic RE.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Europe > France > Provence-Alpes-Côte d'Azur (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (4 more...)
Automated Validation of Textual Constraints Against AutomationML via LLMs and SHACL
Westermann, Tom, Köcher, Aljosha, Gehlhoff, Felix
AutomationML (AML) enables standardized data exchange in engineering, yet existing recommendations for proper AML modeling are typically formulated as informal and textual constraints. These constraints cannot be validated automatically within AML itself. This work-in-progress paper introduces a pipeline to formalize and verify such constraints. First, AML models are mapped to OWL ontologies via RML and SPARQL. In addition, a Large Language Model translates textual rules into SHACL constraints, which are then validated against the previously generated AML ontology. Finally, SHACL validation results are automatically interpreted in natural language. The approach is demonstrated on a sample AML recommendation. Results show that even complex modeling rules can be semi-automatically checked -- without requiring users to understand formal methods or ontology technologies.
- Workflow (0.95)
- Research Report > New Finding (0.34)
From Instructions to ODRL Usage Policies: An Ontology Guided Approach
Mustafa, Daham M., Nadgeri, Abhishek, Collarana, Diego, Arnold, Benedikt T., Quix, Christoph, Lange, Christoph, Decker, Stefan
This study presents an approach that uses large language models such as GPT-4 to generate usage policies in the W3C Open Digital Rights Language ODRL automatically from natural language instructions. Our approach uses the ODRL ontology and its documentation as a central part of the prompt. Our research hypothesis is that a curated version of existing ontology documentation will better guide policy generation. We present various heuristics for adapting the ODRL ontology and its documentation to guide an end-to-end KG construction process. We evaluate our approach in the context of dataspaces, i.e., distributed infrastructures for trustworthy data exchange between multiple participating organizations for the cultural domain. We created a benchmark consisting of 12 use cases of varying complexity. Our evaluation shows excellent results with up to 91.95% accuracy in the resulting knowledge graph.
- South America > Bolivia > Cochabamba Department > Cercado Province (Cochabamba) > Cochabamba (0.04)
- North America > United States (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- (3 more...)
- Overview (0.94)
- Research Report (0.82)
Common Foundations for SHACL, ShEx, and PG-Schema
Ahmetaj, S., Boneva, I., Hidders, J., Hose, K., Jakubowski, M., Labra-Gayo, J. E., Martens, W., Mogavero, F., Murlak, F., Okulmus, C., Polleres, A., Savkovic, O., Simkus, M., Tomaszuk, D.
Graphs have emerged as an important foundation for a variety of applications, including capturing and reasoning over factual knowledge, semantic data integration, social networks, and providing factual knowledge for machine learning algorithms. To formalise certain properties of the data and to ensure data quality, there is a need to describe the schema of such graphs. Because of the breadth of applications and availability of different data models, such as RDF and property graphs, both the Semantic Web and the database community have independently developed graph schema languages: SHACL, ShEx, and PG-Schema. Each language has its unique approach to defining constraints and validating graph data, leaving potential users in the dark about their commonalities and differences. In this paper, we provide formal, concise definitions of the core components of each of these schema languages. We employ a uniform framework to facilitate a comprehensive comparison between the languages and identify a common set of functionalities, shedding light on both overlapping and distinctive features of the three languages.
- Europe > Austria > Vienna (0.14)
- Europe > Germany > Bavaria > Upper Franconia > Bayreuth (0.04)
- Europe > Poland > Podlaskie Province > Bialystok (0.04)
- (6 more...)
Interview with Célian Ringwald: Natural language processing and knowledge graphs
The AAAI/SIGAI Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. This year, 30 students have been selected for this programme, and we'll be hearing from them over the course of the next few months. In this interview, Célian Ringwald, tells us about his work on natural language processing and knowledge graphs. I am a PhD student at the Université Côte d'Azur in Inria, the French Institute in Research in AI. I am part of the Wimmics team, a research group bridging formal semantics and social semantics on the web.
- Europe > France > Provence-Alpes-Côte d'Azur (0.25)
- Europe > Belgium (0.05)
- Personal > Interview (0.35)
- Instructional Material > Course Syllabus & Notes (0.35)
From Shapes to Shapes: Inferring SHACL Shapes for Results of SPARQL CONSTRUCT Queries (Extended Version)
Seifer, Philipp, Hernández, Daniel, Lämmel, Ralf, Staab, Steffen
SPARQL CONSTRUCT queries allow for the specification of data processing pipelines that transform given input graphs into new output graphs. It is now common to constrain graphs through SHACL shapes allowing users to understand which data they can expect and which not. However, it becomes challenging to understand what graph data can be expected at the end of a data processing pipeline without knowing the particular input data: Shape constraints on the input graph may affect the output graph, but may no longer apply literally, and new shapes may be imposed by the query template. In this paper, we study the derivation of shape constraints that hold on all possible output graphs of a given SPARQL CONSTRUCT query. We assume that the SPARQL CONSTRUCT query is fixed, e.g., being part of a program, whereas the input graphs adhere to input shape constraints but may otherwise vary over time and, thus, are mostly unknown. We study a fragment of SPARQL CONSTRUCT queries (SCCQ) and a fragment of SHACL (Simple SHACL). We formally define the problem of deriving the most restrictive set of Simple SHACL shapes that constrain the results from evaluating a SCCQ over any input graph restricted by a given set of Simple SHACL shapes. We propose and implement an algorithm that statically analyses input SHACL shapes and CONSTRUCT queries and prove its soundness and complexity.
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Hampshire > Southampton (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Compliance checking in reified IO logic via SHACL
Robaldo, Livio, Adebayo, Kolawole J.
Reified Input/Output (I/O) logic[21] has been recently proposed to model real-world norms in terms of the logic in [11]. This is massively grounded on the notion of reification, and it has specifically designed to model meaning of natural language sentences, such as the ones occurring in existing legislation. This paper presents a methodology to carry out compliance checking on reified I/O logic formulae. These are translated in SHACL (Shapes Constraint Language) shapes, a recent W3C recommendation to validate and reason with RDF triplestores. Compliance checking is then enforced by validating RDF graphs describing states of affairs with respect to these SHACL shapes.
- Information Technology > Security & Privacy (0.51)
- Law > Statutes (0.48)